04:00
2026-06-05
arxiv.org
machine-learning
Gradient Descent with Large Step Size Restores Symmetry in Deep Linear Networks with Multi-Pathway
A new study shows that discrete Gradient Descent (GD) with a large step size restores symmetry in multi-pathway Deep Linear Networks, counteracting the "winner-takes-all" specialization predicted by G…